CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Application No. 60/413,373, 
filed September 25, 2002, which is incorporated herein by reference in its entirety. 

STATEMENT REGARDING FEDERALLY-SPONSORED 
RESEARCH AND DEVELOPMENT 

[0002] This invention was made with Government support under Grant No. DA/DAAG55- 
98-1-0315 and DA/DAAD19-01-1-0705, Disclosure Z02232, awarded by the Army Re- 
search Office. The Government has certain rights in this invention. 

FIELD OF THE INVENTION 

[0003] The present invention relates to data transmission. More particularly, it relates to
error-correcting codes for data transmission.

BACKGROUND OF THE INVENTION 

[0004] Codes on graphs have become a topic of great current interest in the coding theory
community. The prime examples of codes on graphs are low-density parity-check (LDPC)
codes, which have been widely considered as next-generation error-correcting codes for
many real-world applications in digital communication and magnetic storage. However,
because of their distinct properties, the design and implementation of LDPC decoders and
encoders are not trivial, and the design of these codes still remains a challenging task.
How well one attacks this problem directly determines the extent of LDPC application in
the real world. As a class of LDPC codes, (3,k)-regular LDPC codes can achieve very good
performance and hence are considered in this invention.

[0005] The main challenge when implementing the message passing algorithm for decoding
LDPC codes is managing the passing of the messages. Realizing the required message
passing bandwidth poses very different and difficult challenges depending on whether the
messages are passed in a fully parallel or partly parallel manner. By fully exploiting
the parallelism of the message passing decoding algorithm, a fully parallel decoder can
achieve very high decoding throughput but suffers from prohibitive implementation
complexity (see, e.g., A. J. Blanksby and C. J. Howland, "A 690-mW 1-Gb/s 1024-b,
rate-1/2 low-density parity-check code decoder", IEEE Journal of Solid-State Circuits,
vol. 37, pp. 404-412 (March, 2002)). Furthermore, the large number of interconnections
may limit the speed performance and increase the power dissipation. Thus the fully
parallel design strategy is only suitable for short code lengths.

[0006] In partly parallel decoding, the computations associated with a certain number of
variable nodes or check nodes are time-multiplexed to a single processor. Meanwhile,
since the computation associated with each node is not complicated, the fully parallel
interconnection network should be correspondingly transformed into a partly parallel one
to achieve both reduced communication complexity and high-speed partly parallel decoding.
Unfortunately, the randomness of the Tanner graph makes it nearly impossible to develop
such a transformation. In other words, an arbitrary random LDPC code has little chance of
being suited to a high-speed partly parallel decoder hardware implementation.

[0007] Furthermore, LDPC encoding is typically performed using the generator matrix,
which has quadratic complexity in the block length. How to reduce the encoding complexity
for practical coding system implementation is another crucial issue.

[0008] What is needed is a new joint code-encoder-decoder design methodology, and
techniques for designing practical LDPC coding systems, that overcome the limitations of
the conventional code-first scheme and/or designs.

BRIEF SUMMARY OF THE INVENTION 

[0009] The present invention provides a joint code and decoder design approach to
construct good (3,k)-regular LDPC codes that exactly fit partly parallel decoder and
efficient encoder implementations. A highly regular partly parallel decoder architecture
design is developed. A systematic efficient encoding scheme is presented to significantly
reduce the encoder implementation complexity.

[0010] In accordance with the present invention, an LDPC coding system is designed using
a joint code-encoder-decoder methodology. First, a method is developed to explicitly
construct a high-girth (girth is the length of a shortest cycle in a graph) (2,k)-regular
LDPC code that exactly fits a high-speed partly parallel decoder. Then this (2,k)-regular
LDPC decoder is extended to a (3,k)-regular LDPC partly parallel decoder that is
configured by a set of constrained random parameters. This decoder defines a
(3,k)-regular LDPC code ensemble from which a good (3,k)-regular LDPC code can be
selected based on the criterion of fewer short cycles and on computer simulations. Due to
certain unique structural properties of such (3,k)-regular LDPC codes, an efficient
systematic encoding scheme is developed to reduce the encoding complexity. Since each
code in such a code ensemble is actually constructed by randomly inserting certain check
nodes into the deterministic high-girth (2,k)-regular LDPC code under the constraint
specified by the decoder, the codes in this ensemble are unlikely to contain too many
short cycles, and hence a good code can be easily selected from these codes.

[0011] The (3,k)-regular LDPC code-encoder-decoder design of the present invention can be
used to design and implement LDPC coding systems for a wide variety of real-world
applications that require excellent error-correcting performance and high decoding
speed/low power consumption, such as deep space and satellite communication, optical
links, and magnetic or holographic storage systems. These codes can also be used for
fixed and mobile wireless systems, ultra wide-band systems for personal area networks and
other applications, and wireless local area networks containing wireless receivers with
one or more antennas.

[0012] It is an advantage of the joint code-encoder-decoder design of the present inven- 
tion that it effectively exploits the LDPC code construction flexibility to improve the 
overall LDPC coding system implementation performance. 

[0013] Further embodiments, features, and advantages of the present invention, as well 
as the structure and operation of the various embodiments of the present invention are 
described in detail below with reference to accompanying drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS/FIGURES 

[0014] The present invention is described with reference to the accompanying figures. In 
the figures, like reference numbers indicate identical or functionally similar elements. 
Additionally, the left-most digit or digits of a reference number identify the figure in 
which the reference number first appears. The accompanying figures, which are incor- 
porated herein and form part of the specification, illustrate the present invention and, 
together with the description, further serve to explain the principles of the invention and 
to enable a person skilled in the relevant art to make and use the invention. 

[0015] Fig. 1 illustrates the block diagram of a (3,k)-regular LDPC code and
decoder/encoder joint design flow according to the invention.

[0016] Fig. 2 illustrates the structures of submatrices $H_1$ and $H_2$.

[0017] Fig. 3 illustrates the (2,k)-regular LDPC partly parallel decoder architecture.

[0018] Fig. 4 illustrates the (3,k)-regular LDPC partly parallel decoder architecture.

[0019] Fig. 5 illustrates the g-layer shuffle network.

[0020] Fig. 6 illustrates the structure of an alternative (3,k)-regular LDPC partly
parallel decoder.

[0021] Fig. 7 illustrates the structure of matrix $H^{en}$ for efficient encoding.

DETAILED DESCRIPTION OF THE INVENTION



Background on LDPC Code and Message Passing Decoding Algorithm 

[0022] An LDPC code is defined as the null space of a very sparse M x N parity check
matrix, and is typically represented by a bipartite graph (a bipartite graph is one in
which the nodes can be partitioned into two disjoint sets), usually called a Tanner
graph, between N variable (or message) nodes in one set and M check (or constraint) nodes
in another set. An LDPC code is called (j,k)-regular if each variable node has a degree
of j and each check node has a degree of k. The construction of an LDPC code (or its
Tanner graph) is typically random. LDPC codes can be effectively decoded by iterative
message passing. The structure of the message passing decoding algorithm directly matches
the Tanner graph: decoding messages are iteratively computed by all the variable nodes
and check nodes and exchanged through the edges between the neighboring nodes. It is well
known that the message passing decoding algorithm works well if the underlying Tanner
graph does not contain too many short cycles. Thus, the random Tanner graph is typically
restricted to be 4-cycle free, which is easy to achieve. However, the construction of
random Tanner graphs free of higher order cycles, e.g., 6 and 8, is not trivial.
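As an illustration of why the 4-cycle free restriction is easy to check, the following Python sketch tests a small parity check matrix for 4-cycles. It relies on the standard fact that a Tanner graph has a 4-cycle exactly when two check nodes (rows) share two or more variable nodes (columns). The function name and the example matrices are hypothetical and are not taken from this specification.

```python
from itertools import combinations

def has_4_cycle(H):
    """Return True if the Tanner graph of parity check matrix H has a 4-cycle.

    A 4-cycle exists exactly when two rows (check nodes) have ones in at
    least two common columns (variable nodes).
    """
    supports = [{n for n, bit in enumerate(row) if bit} for row in H]
    return any(len(a & b) >= 2 for a, b in combinations(supports, 2))

# Hypothetical toy matrices, for illustration only.
H_bad = [[1, 1, 0],
         [1, 1, 1]]   # rows share columns 0 and 1, so a 4-cycle exists
H_ok = [[1, 1, 0],
        [0, 1, 1]]    # rows share only column 1

print(has_4_cycle(H_bad), has_4_cycle(H_ok))
```

Checking for 6-cycles or 8-cycles requires examining longer paths and cannot be reduced to a pairwise row comparison, which is consistent with the remark that avoiding higher order cycles is not trivial.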

[0023] Before presenting the message passing algorithm, some definitions should be
introduced first: Let $H$ denote the M x N parity check matrix and $H_{i,j}$ denote the
entry of $H$ at the position (i, j). Define the set of bits n that participate in parity
check m as $\mathcal{N}(m) = \{n : H_{m,n} = 1\}$, and the set of parity checks m in
which bit n participates as $\mathcal{M}(n) = \{m : H_{m,n} = 1\}$. Let
$\mathcal{N}(m) \setminus n$ denote the set $\mathcal{N}(m)$ with bit n excluded, and
$\mathcal{M}(n) \setminus m$ denote the set $\mathcal{M}(n)$ with parity check m excluded.
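The two index sets defined above can be computed directly from the parity check matrix. The following Python sketch is illustrative only; it uses 0-based indices rather than the 1-based indices of the text, and the function name and example matrix are hypothetical.

```python
def neighbor_sets(H):
    """Return (N_of, M_of) for a parity check matrix H given as a list of rows.

    N_of[m] is the set of bits n with H[m][n] == 1 (bits in check m).
    M_of[n] is the set of checks m with H[m][n] == 1 (checks containing bit n).
    """
    M, N = len(H), len(H[0])
    N_of = [{n for n in range(N) if H[m][n] == 1} for m in range(M)]
    M_of = [{m for m in range(M) if H[m][n] == 1} for n in range(N)]
    return N_of, M_of

# Hypothetical 2 x 4 example matrix.
H = [[1, 1, 0, 1],
     [0, 1, 1, 1]]
N_of, M_of = neighbor_sets(H)

# Exclusion of one element corresponds to plain set difference.
print(N_of[0] - {1})   # bits in check 0 other than bit 1
print(M_of[3] - {0})   # checks containing bit 3 other than check 0
```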
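
[0024] Iterative message passing Decoding Algorithm

Input: The channel observations $p_n^0 = P(x_n = 0)$ and $p_n^1 = P(x_n = 1) = 1 - p_n^0$;

Output: Hard decision $\hat{x} = \{\hat{x}_1, \dots, \hat{x}_N\}$;

Procedure:

(a) Initialization: For each n, compute the channel message
$\gamma_n = \log \frac{p_n^0}{p_n^1}$ and, for each
$(m,n) \in \{(i,j) \mid H_{i,j} = 1\}$, compute

$$\alpha_{m,n} = \mathrm{sign}(\gamma_n) \log\left(\tanh(|\gamma_n|/2)\right), \quad \text{where } \mathrm{sign}(\gamma_n) = \begin{cases} +1, & \gamma_n \ge 0 \\ -1, & \gamma_n < 0 \end{cases}$$

(b) Iterative Decoding

• Horizontal (or check node processing) step: For each
$(m,n) \in \{(i,j) \mid H_{i,j} = 1\}$, compute

$$\beta_{m,n} = \log\left(\tanh(\alpha/2)\right) \prod_{n' \in \mathcal{N}(m) \setminus n} \mathrm{sign}(\alpha_{m,n'}), \qquad \text{EQ.(1)}$$

where $\alpha = \sum_{n' \in \mathcal{N}(m) \setminus n} |\alpha_{m,n'}|$.

• Vertical (or variable node processing) step: For each
$(m,n) \in \{(i,j) \mid H_{i,j} = 1\}$, compute

$$\alpha_{m,n} = \mathrm{sign}(\gamma_{m,n}) \log\left(\tanh(|\gamma_{m,n}|/2)\right), \qquad \text{EQ.(2)}$$

where $\gamma_{m,n} = \gamma_n + \sum_{m' \in \mathcal{M}(n) \setminus m} \beta_{m',n}$.
For each n, update the "pseudo-posterior log-likelihood ratio (LLR)" $\Lambda_n$ as:

$$\Lambda_n = \gamma_n + \sum_{m \in \mathcal{M}(n)} \beta_{m,n}. \qquad \text{EQ.(3)}$$

• Decision step:

i. Perform hard decision on $\{\Lambda_1, \dots, \Lambda_N\}$ to obtain
$\hat{x} = \{\hat{x}_1, \dots, \hat{x}_N\}$ such that $\hat{x}_n = 0$ if $\Lambda_n > 0$
and $\hat{x}_n = 1$ if $\Lambda_n \le 0$;

ii. If $H \cdot \hat{x} = 0$, then the algorithm terminates, else go to the Horizontal
step. A failure is declared if the preset maximum number of iterations is reached
without successful decoding.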
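The horizontal, vertical and decision steps can be sketched in software as follows. This is a minimal floating-point sketch of log-domain sum-product (message passing) decoding, not the hardware decoder described in this specification. It makes two labeled assumptions: it uses the positive-magnitude transform phi(x) = -log(tanh(x/2)), a common sign convention that differs from the notation of the text, and it stores the untransformed values gamma_{m,n} as variable-to-check messages and applies phi inside the check node update. All names and the example code (a small (7,4) Hamming parity check matrix) are hypothetical.

```python
import math

def phi(x):
    """phi(x) = -log(tanh(x/2)), with x clipped so the result stays finite."""
    x = min(max(x, 1e-12), 50.0)
    return -math.log(math.tanh(x / 2.0))

def sign(x):
    return 1.0 if x >= 0 else -1.0

def decode(H, gamma, max_iter=20):
    """Decode channel LLRs gamma (positive favours bit 0) on parity check matrix H."""
    M, N = len(H), len(H[0])
    N_of = [[n for n in range(N) if H[m][n]] for m in range(M)]
    M_of = [[m for m in range(M) if H[m][n]] for n in range(N)]
    # Initialization: every variable-to-check message starts as the channel LLR.
    v2c = {(m, n): gamma[n] for m in range(M) for n in N_of[m]}
    x_hat = [0 if g > 0 else 1 for g in gamma]
    for _ in range(max_iter):
        # Horizontal step: check-to-variable messages from all other neighbours.
        c2v = {}
        for m in range(M):
            for n in N_of[m]:
                others = [v2c[(m, n2)] for n2 in N_of[m] if n2 != n]
                s = 1.0
                for v in others:
                    s *= sign(v)
                c2v[(m, n)] = s * phi(sum(phi(abs(v)) for v in others))
        # Vertical step: pseudo-posterior LLRs and extrinsic variable-to-check messages.
        llr = [gamma[n] + sum(c2v[(m, n)] for m in M_of[n]) for n in range(N)]
        for n in range(N):
            for m in M_of[n]:
                v2c[(m, n)] = llr[n] - c2v[(m, n)]
        # Decision step: hard decision followed by the parity check.
        x_hat = [0 if value > 0 else 1 for value in llr]
        if all(sum(x_hat[n] for n in N_of[m]) % 2 == 0 for m in range(M)):
            return x_hat, True
    return x_hat, False

# Hypothetical example: all-zero codeword with an unreliable first bit.
H_example = [[1, 1, 0, 1, 1, 0, 0],
             [1, 0, 1, 1, 0, 1, 0],
             [0, 1, 1, 1, 0, 0, 1]]
decoded, ok = decode(H_example, [-1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0])
print(decoded, ok)
```

The example matrix contains 4-cycles and is used only because it is small; it is not one of the codes constructed in this specification.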



[0025] In the above algorithm, $\alpha_{m,n}$ and $\beta_{m,n}$ are called
variable-to-check messages and check-to-variable messages, respectively. Each check node
computation is realized by a Check Node processing Unit (CNU) to compute the
check-to-variable message $\beta_{m,n}$ according to EQ. (1); each variable node
computation is realized by a Variable Node processing Unit (VNU) to compute the
variable-to-check message $\alpha_{m,n}$ and pseudo-posterior LLR $\Lambda_n$ according
to EQ. (2) and EQ. (3), respectively, and to generate $\hat{x}_n$ by performing hard
decision on $\Lambda_n$.

Joint (3,k)-Regular LDPC Code and Decoder/Encoder Design 

[0026] It is well known that the message passing algorithm for LDPC decoding works 
well if the underlying Tanner graph is 4-cycle free and does not contain too many short 
cycles. Thus the essential objective of this joint design approach is to construct LDPC 
codes that not only fit to practical decoder/encoder implementations but also have large 
average cycle length in their 4-cycle free Tanner graphs. 

[0027] Given a graph G, let $g_u$ denote the length of the shortest cycle that passes
through node u in graph G; then $\sum_{u \in G} g_u / N$ is denoted as the girth average
of G, where $N = |G|$ is the total node number of G. Girth average can be used as an
effective criterion for searching for a good LDPC code over one code ensemble. Fig. 1
illustrates the schematic diagram of the joint design approach, which is briefly
described below.

(a) Step 100: Explicitly construct the two matrices, $H_1$ and $H_2$, so that
$H = [H_1^T, H_2^T]^T$ defines a (2,k)-regular LDPC code denoted as $C_2$;

(b) Step 102: Obtain an elaborated (3,k)-regular LDPC decoder architecture which defines
a random (3,k)-regular LDPC code ensemble, where each code in this ensemble is a sub-code
of $C_2$;

(c) Step 104: Use the decoder to randomly generate a certain number of (3,k)-regular LDPC
codes from which one can select one code with good performance by girth average
comparison and computer simulations;

(d) Step 106: Let $H$ denote the parity check matrix of the selected code. Introduce an
explicit column permutation $\pi_c$ to generate an approximate upper triangular matrix
$H^{en} = \pi_c(H)$ based on which an efficient encoding scheme is obtained.
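The girth average used as a selection criterion in Step 104 can be estimated in software as sketched below. The sketch finds the shortest cycle through a node u by removing, in turn, each edge (u, v) and running a breadth-first search from u to v. It is illustrative only: the function names are hypothetical, and nodes that lie on no cycle are simply skipped, which is an assumption since the text does not say how such nodes are treated.

```python
from collections import deque

def tanner_adjacency(H):
    """Adjacency sets of the Tanner graph; nodes are ('v', n) and ('c', m)."""
    M, N = len(H), len(H[0])
    adj = {("v", n): set() for n in range(N)}
    adj.update({("c", m): set() for m in range(M)})
    for m in range(M):
        for n in range(N):
            if H[m][n]:
                adj[("v", n)].add(("c", m))
                adj[("c", m)].add(("v", n))
    return adj

def shortest_cycle_through(adj, u):
    """Length of the shortest cycle through node u, or None if there is none."""
    best = None
    for v in adj[u]:
        # Shortest path from u to v that does not use the direct edge (u, v).
        dist = {u: 0}
        queue = deque([u])
        while queue and v not in dist:
            a = queue.popleft()
            for b in adj[a]:
                if a == u and b == v:
                    continue
                if b not in dist:
                    dist[b] = dist[a] + 1
                    queue.append(b)
        if v in dist and (best is None or dist[v] + 1 < best):
            best = dist[v] + 1
    return best

def girth_average(H):
    """Average of g_u over all nodes that lie on a cycle (assumed convention)."""
    adj = tanner_adjacency(H)
    lengths = [g for g in (shortest_cycle_through(adj, u) for u in adj) if g is not None]
    return sum(lengths) / len(lengths) if lengths else None

# Hypothetical toy matrices: a single 6-cycle and a single 4-cycle.
H6 = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
H4 = [[1, 1], [1, 1]]
print(girth_average(H6), girth_average(H4))
```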



Construction of $H_1$ and $H_2$

[0028] A method is developed to construct the matrix $H = [H_1^T, H_2^T]^T$ which defines
a (2,k)-regular LDPC code with girth of 12. Such a construction method leads to a very
simple decoder architecture and provides more freedom in the code length selection: Given
k, any code length that can be factored as $L \cdot k^2$ is permitted, where L cannot be
factored as $L = a \cdot b$, $\forall a, b \in \{0, \dots, k-1\}$.

[0029] Fig. 2 shows the structures of $H_1$ and $H_2$. Each block matrix $I_{x,y}$ in
$H_1$ is an L x L identity matrix and each block matrix $P_{x,y}$ in $H_2$ is obtained by
a cyclic shift of an L x L identity matrix. Let $T$ denote the right cyclic shift
operator, where $T^u(U)$ represents right cyclic shifting of matrix $U$ by u columns;
then $P_{x,y} = T^u(I)$, where $u = ((x-1) \cdot y) \bmod L$ and $I$ represents the
L x L identity matrix. For example, let L = 5, x = 3 and y = 4; then
$u = ((x-1) \cdot y) \bmod L = 8 \bmod 5 = 3$, and

$$P_{3,4} = T^3(I) = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}$$

Clearly, the matrix $H = [H_1^T, H_2^T]^T$ defines a (2,k)-regular LDPC code with
$L \cdot k^2$ variable nodes and $2L \cdot k$ check nodes. Let G denote the
corresponding Tanner graph. The proposed (2,k) code has the property that if L cannot be
factored as $L = a \cdot b$, where $a, b \in \{0, \dots, k-1\}$, then the girth of G is
12 and there is at least one 12-cycle passing through each check node.
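The construction can be illustrated with the following Python sketch, which builds the cyclically shifted identity blocks and assembles a (2,k)-regular matrix. Because Fig. 2 is not reproduced in the text, the block layout is an assumption: column group (x, y) is placed at block column (x-1)*k + (y-1), block row x of the upper half holds the identity blocks, and block row y of the lower half holds the shifted blocks with shift ((x-1)*y) mod L. The function names are hypothetical.

```python
def admissible(L, k):
    """True if L cannot be written as a*b with a, b in {0, ..., k-1}."""
    return all(a * b != L for a in range(k) for b in range(k))

def shifted_identity(L, u):
    """L x L identity matrix with its columns cyclically shifted right by u."""
    return [[1 if c == (r + u) % L else 0 for c in range(L)] for r in range(L)]

def build_H(L, k):
    """Assemble H = [H1; H2] under the assumed block layout described above."""
    n_cols = L * k * k
    H1 = [[0] * n_cols for _ in range(L * k)]
    H2 = [[0] * n_cols for _ in range(L * k)]
    for x in range(1, k + 1):
        for y in range(1, k + 1):
            col0 = ((x - 1) * k + (y - 1)) * L      # assumed position of group (x, y)
            P = shifted_identity(L, ((x - 1) * y) % L)
            for r in range(L):
                H1[(x - 1) * L + r][col0 + r] = 1   # identity block in block row x
                for c in range(L):
                    H2[(y - 1) * L + r][col0 + c] = P[r][c]  # shifted block in block row y
    return H1 + H2

# Worked example from the text: L = 5, x = 3, y = 4 gives a shift of 3.
P34 = shifted_identity(5, ((3 - 1) * 4) % 5)
for row in P34:
    print(row)

H = build_H(5, 3)
col_weights = {sum(row[n] for row in H) for n in range(len(H[0]))}
row_weights = {sum(row) for row in H}
print(len(H), len(H[0]), col_weights, row_weights)
```

With L = 5 and k = 3 the sketch yields 30 check nodes and 45 variable nodes, with every column of weight 2 and every row of weight 3, as expected for a (2,3)-regular code.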

(2,k)-Regular LDPC Decoder Architecture 

[0030] A partly parallel decoder architecture is developed for the (2,k)-regular LDPC
code constructed above. To facilitate the following description, the $L \cdot k^2$
variable nodes of the code are arranged into $k^2$ variable node groups (VG); each group
$VG_{x,y}$ ($1 \le x, y \le k$) contains the L variable nodes corresponding to the L
columns in $H$ going through the block matrices $I_{x,y}$ and $P_{x,y}$. Notice that any
two variable nodes in the same group never connect to the same check node, and all the k
variable nodes connected to the same check node as specified by $H_1$ ($H_2$) always
come from k groups with the same x-index (y-index). Based on the above observations, a
partly parallel decoder architecture can be directly constructed, as shown in Fig. 3.

[0031] This decoder contains $k^2$ memory banks, denoted as MEM BANK-(x, y) for
$1 \le x, y \le k$. MEM BANK-(x, y) stores all the p-bit channel messages in RAM (random
access memory) I, the q-bit variable-to-check and check-to-variable messages in RAMs E1
and E2, and the hard decisions in RAM C associated with the L variable nodes in
$VG_{x,y}$. In this decoder, the check-to-variable message $\beta_{m,n}$ and
variable-to-check message $\alpha_{m,n}$ associated with each pair of neighboring nodes
alternately occupy the same memory location. The two check-to-variable or
variable-to-check messages associated with each variable node are stored in E1 and E2 at
the same address. This decoder completes each decoding iteration in 2L clock cycles, and
in each clock cycle it performs:

(a) In each memory bank, if all the check-to-variable messages $\beta_{m,n}$ associated
with one variable node become available after the previous clock cycle, then

i. Retrieve 1 channel message $\gamma_n$ from RAM I and 2 check-to-variable messages
$\beta_{m,n}$ associated with this variable node from RAMs E1 and E2;

ii. VNU computes 2 variable-to-check messages $\alpha_{m,n}$ and LLR $\Lambda_n$, and
obtains $\hat{x}_n$ by performing hard decision on $\Lambda_n$;

iii. Store the 2 variable-to-check messages $\alpha_{m,n}$ back to RAMs E1 and E2 and
$\hat{x}_n$ to RAM C.

(b) Retrieve $k^2$ variable-to-check messages $\alpha_{m,n}$ and hard decisions
$\hat{x}_n$ (obtained in the previous iteration) from the $k^2$ memory banks at the
addresses provided by the $AG_{x,y}$'s;

(c) Shuffle the $k^2$ variable-to-check messages and hard decisions by a one layer
shuffle network 300. This shuffle network is configured by the configuration bit
$c_{-1}$, leading to a fixed permutation (or re-ordering pattern), denoted by
$\pi_{-1}$, if $c_{-1} = 1$, or to the identity permutation, denoted by Id, if
$c_{-1} = 0$;

(d) Each $CNU_j$ computes k check-to-variable messages $\beta_{m,n}$ and performs the
parity check on the k hard decisions $\hat{x}_n$;

(e) Unshuffle the $k^2$ check-to-variable messages $\beta_{m,n}$ and store them back into
the $k^2$ memory banks at the original locations.

[0032] To realize the connectivity specified by $H_1$ and $H_2$ in the 1st and 2nd L
clock cycles, respectively, this decoder has the following features:

• Each Address Generator ($AG_{x,y}$) provides the memory address to E1 and E2 during the
1st and 2nd L clock cycles, respectively. Each $AG_{x,y}$ is a modulo-L binary counter
which is set to the initial value $D_{x,y}$ every L clock cycles, i.e., at r = 0, L,
where

$$D_{x,y} = \begin{cases} 0, & r = 0, \\ ((x-1) \cdot y) \bmod L, & r = L. \end{cases} \qquad \text{EQ.(4)}$$

• During the 1st L clock cycles, the configuration bit $c_{-1} = 1$ and $\pi_{-1}$
permutes the input data $\{x_0, \dots, x_{k^2-1}\}$ to
$\{x_{\pi_{-1}(0)}, \dots, x_{\pi_{-1}(k^2-1)}\}$, where

$$\pi_{-1}(i) = (i \bmod k) \cdot k + \lfloor i/k \rfloor. \qquad \text{EQ.(5)}$$

• During the 2nd L clock cycles, the configuration bit $c_{-1} = 0$ and the shuffle
network is bypassed.
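The address generation and the fixed one layer shuffle described in the features above can be modeled as follows. The sketch assumes that EQ.(5) reads pi(i) = (i mod k)*k + floor(i/k), which transposes a k x k arrangement of indices, and that each modulo-L counter increments by one per clock cycle after being loaded with its initial value from EQ.(4); both points are interpretations rather than statements of the text. The function names are hypothetical.

```python
def pi_minus1(i, k):
    """Assumed form of EQ.(5): transpose of the index i in a k x k arrangement."""
    return (i % k) * k + i // k

def one_layer_shuffle(data, k, c_minus1):
    """Output position i holds data[pi_minus1(i)] if the bit is 1, else data[i]."""
    if c_minus1 == 0:
        return list(data)
    return [data[pi_minus1(i, k)] for i in range(k * k)]

def ag_address(x, y, r, L):
    """Address from the generator of group (x, y) at clock cycle r, 0 <= r < 2L.

    The counter is loaded with 0 at r = 0 and with ((x-1)*y) mod L at r = L
    (EQ.(4)); counting up by one per cycle is an assumption.
    """
    start = 0 if r < L else ((x - 1) * y) % L
    return (start + r % L) % L

k, L = 3, 5
print(one_layer_shuffle(list(range(k * k)), k, 1))
print([ag_address(3, 4, r, L) for r in range(2 * L)])
```

Under these assumptions the permutation is its own inverse, so the same network can be used to unshuffle the check-to-variable messages on their way back to the memory banks.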



(3,k)-Regular LDPC Decoder Architecture 

[0033] Fig. 4 illustrates the (3,k)-regular LDPC partly parallel decoder obtained by
introducing a few new blocks into the (2,k)-regular LDPC decoder. Configured by a set of
constrained random parameters, this decoder defines a (3,k)-regular LDPC code ensemble in
which each code has $L \cdot k^2$ variable nodes and $3L \cdot k$ check nodes.

[0034] In this decoder, a Random Permutation Generator (RPG) 400 and a g-layer shuffle
network 402 are inserted between the 1-layer shuffle network ($\pi_{-1}$ or Id) and all
the CNUs. Fig. 5 shows the structure of the g-layer shuffle network: each layer is a
single layer shuffle network configured by $c_i$, leading to a given permutation $\pi_i$
if $c_i = 1$ ($\pi_i^1$), or to the identity permutation (Id $= \pi_i^0$) otherwise.
Thus, configured by the g-bit word $c = (c_{g-1}, \dots, c_0)_2$ generated by the RPG,
the overall permutation pattern $\pi$ in each clock cycle is the product of g
permutations: $\pi = \pi_{g-1}^{c_{g-1}} \circ \cdots \circ \pi_0^{c_0}$.
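The effect of the g-bit configuration word on the multi-layer shuffle network can be sketched as follows. Each layer either applies its fixed permutation or passes the data through, depending on one bit of the word. The sketch assumes that layer 0 is applied first and that a permutation p places data[p[i]] at output position i; the layer permutations shown are arbitrary examples, not those of Fig. 5, and all names are hypothetical.

```python
def apply_layer(data, perm, bit):
    """One layer: output position i takes data[perm[i]] if bit is 1, else data[i]."""
    return [data[p] for p in perm] if bit else list(data)

def g_layer_shuffle(data, layers, c):
    """Pass data through all layers; bit i of the integer c configures layer i.

    Layer 0 is applied first (assumed ordering).
    """
    for i, perm in enumerate(layers):
        data = apply_layer(data, perm, (c >> i) & 1)
    return data

# Hypothetical 2-layer network on 4 inputs.
layers = [[1, 0, 2, 3],   # layer 0 swaps the first two inputs
          [0, 1, 3, 2]]   # layer 1 swaps the last two inputs
for c in range(4):
    print(c, g_layer_shuffle(["a", "b", "c", "d"], layers, c))
```

A network with g layers can therefore realize up to 2^g distinct overall permutations while storing only g fixed ones.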

[0035] This decoder completes each decoding iteration in 3L clock cycles. CNUs access the
messages stored in E1, E2 and E3 in the 1st, 2nd and 3rd L clock cycles, respectively.
Denote the $2L \cdot k$ check nodes associated with the messages stored in E1 and E2 as
deterministic check nodes and the other $L \cdot k$ check nodes associated with the
messages stored in E3 as random check nodes.

[0036] During the first 2L clock cycles, this decoder bypasses the g-layer shuffle
network by setting the output of the RPG to a zero vector, and works in exactly the same
way as the (2,k)-regular LDPC decoder. In other words, this decoder realizes the
connectivity specified by $H = [H_1^T, H_2^T]^T$ between all variable nodes and the
$2L \cdot k$ deterministic check nodes during the first 2L clock cycles.

[0037] During the last L clock cycles, this decoder realizes the connectivity between all
variable nodes and the $L \cdot k$ random check nodes according to the following
specifications.

• The RPG performs as a hash function
$f: \{2L, \dots, 3L-1\} \rightarrow \{0, \dots, 2^g - 1\}$ and its g-bit output vector c
configures the g-layer shuffle network;

• The configuration bit $c_{-1} = 0$ so that the 1-layer shuffle network performs the
identity permutation;

• $AG_{x,y}$ provides the address to E3 with the counter set to $D_{x,y} = t_{x,y}$ at
r = 2L, where $t_{x,y} \in \{0, \dots, L-1\}$.

[0038] In order to guarantee that this code ensemble only contains 4-cycle free codes and
to facilitate the design process, the hash function f and the g-layer shuffle network are
generated randomly and the value of each $t_{x,y}$ is chosen randomly under the following
constraints:

(a) Given x, $t_{x,y_1} \ne t_{x,y_2}$, $\forall y_1, y_2 \in \{1, \dots, k\}$ with
$y_1 \ne y_2$;

(b) Given y, $t_{x_1,y} - t_{x_2,y} \not\equiv ((x_1 - x_2) \cdot y) \bmod L$,
$\forall x_1, x_2 \in \{1, \dots, k\}$ with $x_1 \ne x_2$.

The code ensemble defined by the proposed decoder is referred to as the
implementation-oriented (3,k)-regular LDPC code ensemble. For real applications, a good
code is selected from this code ensemble in two steps: first randomly generate a certain
number of implementation-oriented (3,k)-regular LDPC codes, then pick a few codes with
high girth averages and select from them the one leading to the best code performance
simulation results.
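The random choice of the initial addresses under constraints (a) and (b) can be illustrated by simple rejection sampling, as sketched below. The sketch reads constraint (b) as a non-congruence modulo L and applies both constraints only to distinct index pairs, which is an interpretation of the text. The function names, the retry limit and the parameter values are hypothetical.

```python
import random

def satisfies_constraints(t, L, k):
    """Check constraints (a) and (b) for a dict t[(x, y)] with 1 <= x, y <= k."""
    # (a) For a fixed x, the k values t[(x, y)] must be pairwise distinct.
    for x in range(1, k + 1):
        if len({t[(x, y)] for y in range(1, k + 1)}) != k:
            return False
    # (b) For a fixed y, t[(x1, y)] - t[(x2, y)] must differ from (x1 - x2)*y mod L.
    # Checking x1 < x2 suffices because the condition is symmetric under swapping.
    for y in range(1, k + 1):
        for x1 in range(1, k + 1):
            for x2 in range(x1 + 1, k + 1):
                if (t[(x1, y)] - t[(x2, y)]) % L == ((x1 - x2) * y) % L:
                    return False
    return True

def sample_t(L, k, rng, max_tries=100000):
    """Draw uniformly random values until both constraints hold."""
    for _ in range(max_tries):
        t = {(x, y): rng.randrange(L)
             for x in range(1, k + 1) for y in range(1, k + 1)}
        if satisfies_constraints(t, L, k):
            return t
    raise RuntimeError("no admissible assignment found within max_tries")

t = sample_t(L=7, k=3, rng=random.Random(0))
print(satisfies_constraints(t, 7, 3))
```

Repeating the draw with different seeds yields different members of the ensemble, which can then be ranked by girth average and simulation as described above.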



[0039] One unique property of this work is the use of a counter for memory address
generation, which largely reduces the decoder hardware implementation complexity and
improves the throughput performance compared with the previous design solution (see E.
Boutillon, J. Castura and F. R. Kschischang, "Decoder-First Code Design", Proceedings of
the 2nd International Symposium on Turbo Codes and Related Topics, pp. 459-462 (Sept.
2000)), in which more complex random number generators are used to generate the memory
access addresses.



An Alternative (3,k)-Regular LDPC Decoder Architecture 

[0040] The architecture shown in Fig. 4 demands 3L clock cycles to complete one decoding
iteration. As an alternative design solution, Fig. 6 shows the principal structure of a
partly parallel decoder that requires only 2L clock cycles for one iteration. It mainly
contains $k^2$ PE Blocks $PE_{x,y}$ for $1 \le x, y \le k$, three bi-directional shuffle
networks $\pi_1$, $\pi_2$ and $\pi_3$, and $3 \cdot k$ CNUs. Each $PE_{x,y}$ contains
one memory bank $RAMs_{x,y}$ that stores all the decoding information associated with all
the L variable nodes in the variable node group $VG_{x,y}$, and contains one VNU to
perform the variable node computations for these L variable nodes. Each bi-directional
shuffle network $\pi_i$ realizes the decoding information exchange between all the
$L \cdot k^2$ variable nodes and the $L \cdot k$ check nodes corresponding to $H_i$. The
k $CNU_{i,j}$'s for $j = 1, \dots, k$ perform the check node computations for all the
$L \cdot k$ check nodes corresponding to $H_i$.

[0041] This decoder completes each decoding iteration in 2L clock cycles, and during the
1st and 2nd L clock cycles, it works in check node processing mode and variable node
processing mode, respectively. In the check node processing mode, the decoder not only
performs the computations of all the check nodes but also completes the decoding
information exchange between neighboring nodes. In variable node processing mode, the
decoder only performs the computations of all the variable nodes.

Efficient Encoding Scheme

[0042] The straightforward encoding scheme for LDPC codes, using the generator matrix,
has quadratic complexity in the block length, which is prohibitive with respect to
implementation complexity. Based on the specific structure of the parity check matrix of
the proposed (3,k)-regular LDPC codes, a systematic approach is proposed for its
efficient encoding. The basic idea is: first obtain an approximate upper triangular
matrix $H^{en} = \pi_c(H)$ by introducing an explicit column permutation $\pi_c$, then
obtain $\hat{x}$ by performing efficient encoding based on $H^{en}$, and finally get the
codeword $x = \pi_c^{-1}(\hat{x})$.

[0043] The parity check matrix of the developed (3,k)-regular LDPC code has the form
$H = [H_1^T, H_2^T, H_3^T]^T$, where $H_1$ and $H_2$ are shown in Fig. 2. The submatrix
consisting of all the columns of $H$ which go through the block matrix $I_{x,y}$ in
$H_1$ is denoted as $H^{(x,y)}$, e.g., $H^{(1,2)}$ as shown in Fig. 1. A column
permutation $\pi_c$ is introduced to move
each H^ forward to the position just right to H^ 1 ) where x increases from 3 to k 
successively. Because each $P_{1,y}$ is an identity matrix, the matrix
$H^{en} = \pi_c(H)$ has the structure shown in Fig. 7, based on which one can write the
matrix $H^{en}$ in block matrix form as

$$H^{en} = \begin{bmatrix} T & B & D \\ A & C & E \end{bmatrix} \qquad \text{EQ.(6)}$$



[0044] The efficient encoding is carried out based on the matrix $H^{en}$. Let
$\hat{x} = (\hat{x}_a, \hat{x}_b, \hat{x}_c)$ be a tentative codeword decomposed
according to EQ.(6), where $\hat{x}_c$ contains the information bits of length
$N - M + r$; the redundant bits $\hat{x}_a$ and $\hat{x}_b$ are of length
$(2k-1) \cdot L$ and $(k+1) \cdot L - r$, respectively. The encoding process is outlined
as follows:

(a) Compute $y_c = D \cdot \hat{x}_c$ and $z_c = E \cdot \hat{x}_c$, which is efficient
because both $D$ and $E$ are sparse;

(b) Solve $T \cdot \hat{x}'_a = y_c$. Since $T$ has the form shown in Fig. 7, it can be
proved that $T^{-1} = T$. Thus $\hat{x}'_a = T \cdot y_c$, which can be easily computed
since $T$ is sparse;

(c) Evaluate $s = A \cdot \hat{x}'_a + z_c$, which is also efficient since $A$ is sparse;

(d) Compute $\hat{x}_b = G \cdot s$, where $G = (A \cdot T \cdot B + C)^{-1}$. In this
step, the complexity is proportional to $((k+1) \cdot L - r)^2$;

(e) Finally, one can obtain $\hat{x}_a$ by solving
$T \cdot \hat{x}_a = B \cdot \hat{x}_b + y_c$. Since $T^{-1} = T$,
$\hat{x}_a = T \cdot (B \cdot \hat{x}_b + y_c)$. This is efficient since both $T$ and
$B$ are sparse.
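Steps (a) through (e) can be traced on a toy example with the following Python sketch, which performs all arithmetic over GF(2) using dense lists. It is illustrative only: the small block matrices are hypothetical and were chosen so that T is its own inverse and A·T·B + C is invertible; they are not derived from the codes of this specification, and a practical encoder would exploit sparsity rather than dense products. All function names are hypothetical.

```python
def mat_vec(A, v):
    return [sum(a & b for a, b in zip(row, v)) % 2 for row in A]

def mat_mul(A, B):
    cols = list(zip(*B))
    return [[sum(a & b for a, b in zip(row, col)) % 2 for col in cols] for row in A]

def mat_add(A, B):
    return [[a ^ b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def vec_add(u, v):
    return [a ^ b for a, b in zip(u, v)]

def gf2_inverse(A):
    """Gauss-Jordan inversion over GF(2); raises ValueError if A is singular."""
    n = len(A)
    aug = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular over GF(2)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col]:
                aug[r] = [a ^ b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def encode(T, A, B, C, D, E, x_c):
    """Return the tentative codeword (x_a, x_b, x_c) as one list."""
    y_c = mat_vec(D, x_c)                                    # step (a)
    z_c = mat_vec(E, x_c)
    x_a_prime = mat_vec(T, y_c)                              # step (b), using T^-1 = T
    s = vec_add(mat_vec(A, x_a_prime), z_c)                  # step (c)
    G = gf2_inverse(mat_add(mat_mul(mat_mul(A, T), B), C))   # step (d)
    x_b = mat_vec(G, s)
    x_a = mat_vec(T, vec_add(mat_vec(B, x_b), y_c))          # step (e)
    return x_a + x_b + x_c

# Hypothetical toy blocks; T is upper triangular with T*T = I over GF(2).
T = [[1, 1], [0, 1]]
A = [[1, 0]]
B = [[1], [1]]
C = [[1]]
D = [[1, 0], [1, 1]]
E = [[1, 1]]
H_en = [T[i] + B[i] + D[i] for i in range(2)] + [A[0] + C[0] + E[0]]

for info in ([0, 0], [0, 1], [1, 0], [1, 1]):
    word = encode(T, A, B, C, D, E, info)
    print(info, word, mat_vec(H_en, word))
```

In a real system the matrix G would be computed once and stored, so that only step (d) involves a dense matrix-vector product.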

[0045] The real codeword is obtained as $x = \pi_c^{-1}(\hat{x})$. The information bits
on the decoder side can be easily obtained by performing the column permutation $\pi_c$
on the decoder output.

Conclusions 

[0046] A joint (3,k)-regular LDPC code and decoder/encoder design approach has been
presented. By jointly considering good LDPC code construction and practical
decoder/encoder VLSI implementation, this approach successfully addresses the practical
(3,k)-regular LDPC coding system design and implementation problems for real-world
applications. It will be understood by those skilled in the art that various changes in
form and details can be made therein without departing from the spirit and scope of the
invention as defined in the appended claims. Thus, the breadth and scope of the present
invention should not be limited by any of the above-described exemplary embodiments, but
should be defined only in accordance with the following claims and their equivalents.